Corelab Seminar
2020-2021
Lydia Zakynthinou
Conditional PAC-Bayes and Mutual Information Generalization Bounds: Fast rates that handle general VC classes
Abstract.
PAC-Bayesian and mutual information (MI) generalization error bounds are standard approaches that have proven very useful in the study of the generalization properties of machine learning algorithms. However, recent work has shown a deficiency of such bounds: there exist VC classes for which MI and PAC-Bayes generalization error bounds must remain trivial (Bassily et al. 2018, Livni and Moran 2020), even though the theory of uniform convergence guarantees generalization for such classes. To circumvent this impossibility result for MI, we introduced a new framework, that of conditional MI (Steinke and Zakynthinou 2020), which unifies MI, uniform convergence (VC), differential privacy, and other approaches.
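For reference, a minimal sketch in LaTeX notation of the conditional MI quantity of Steinke and Zakynthinou (2020) and the basic generalization bound it yields; the supersample notation (\tilde{Z}, S) below is ours and only illustrative.

% Supersample setup: \tilde{Z} \in \mathcal{Z}^{n \times 2} holds 2n i.i.d. draws from D,
% S \in \{0,1\}^n is uniform, and the algorithm A trains on \tilde{Z}_S = (\tilde{Z}_{i,S_i})_{i=1}^{n}.
\mathrm{CMI}_D(A) = I\big( A(\tilde{Z}_S) ;\, S \,\big|\, \tilde{Z} \big)
% For losses bounded in [0,1], the expected generalization gap satisfies
\big| \mathbb{E}[\mathrm{gen}(A)] \big| \le \sqrt{ 2\, \mathrm{CMI}_D(A) / n }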
We now give a novel, unified derivation of conditional PAC-Bayesian and MI generalization bounds, which yields nontrivial bounds for general VC classes. Moreover, it yields fast rates of order O((COMPLEXITY/n)^γ) with γ > 1/2 when a Bernstein condition holds, and with γ = 1 for exp-concave losses, which is impossible with both standard PAC-Bayes and MI generalization bounds.
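To make the rate comparison concrete (an illustrative reading of the abstract, with the d log n complexity measure and omitted constants chosen by us): for a VC class of dimension d one can take COMPLEXITY on the order of d log n, so the standard slow rate (γ = 1/2) and the fast rate claimed for exp-concave losses (γ = 1) compare as

O\big( \sqrt{ d \log n / n } \big) \;\; (\gamma = 1/2, \text{ slow})
\qquad \text{vs.} \qquad
O\big( d \log n / n \big) \;\; (\gamma = 1, \text{ fast})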
Joint work with Peter Grünwald (CWI) and Thomas Steinke (Google Brain).